📊 IVF Indexes
Specific
Inverted File Index, Vector Clustering, Quantization, ANN Search
Scoured 27358 posts in 59.0 ms
OrionsLock/SALOMI: Research code for extreme low-bit transformer quantization and inference.
🔤 Tokenization · github.com · 21h · Hacker News
TurboQuant: KV Cache Quantization to 3.5 Bits with Zero Accuracy Loss - ICLR 2026
🗜️ Vector Compression · darshanfofadiya.com · 4d · Hacker News
Linear Regression from 1-bit Quantized Data
🔢 Binary Embeddings · arxiv.org · 1d
Tq-KV – Rust implementation of TurboQuant that works on GGUF models
🎯 Qdrant · news.ycombinator.com · 1d · Hacker News
Fused INT8 Weight-Only Quantization in Pallas
🔬 RaBitQ · rishirajacharya.com · 3d
Fujitsu One Compression (LLM Quantization)
🎯 Vector Quantization · fujitsuresearch.github.io · 1d · Hacker News
75% of What a Neural Network Learns is noise. So is 75% of What You Learned in School.
🔢 BitNet · pub.towardsai.net · 1d
AdaLoRA-QAT: Adaptive Low-Rank and Quantization-Aware Segmentation
0 Binary Vector Embeddings · arxiv.org · 21h
Brainstacks: Cross-Domain Cognitive Capabilities via Frozen MoE-LoRA Stacks for Continual LLM Learning
🏗️ LLM Infrastructure · arxiv.org · 21h
SliderQuant: Accurate Post-Training Quantization for LLMs
🧠 LLM Inference · arxiv.org · 6d
KV Cache Quantization for Self-Forcing Video Generation: A 33-Method Empirical Study
🔬 RaBitQ · arxiv.org · 2d
Proxima: Near-storage Acceleration for Graph-based Approximate Nearest Neighbor Search in 3D NAND
💨 Cache-Friendly Algorithms · arxiv.org · 2d
PQuantML: A Tool for End-to-End Hardware-aware Model Compression
🔬 RaBitQ · arxiv.org · 3d
OneComp: One-Line Revolution for Generative AI Model Compression
📱 Edge AI Optimization · arxiv.org · 1d
Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML
🔬 RaBitQ · arxiv.org · 6d
Adaptive Block-Scaled Data Types
📏 Linear Types · arxiv.org · 2d · Hacker News
Octree-based Learned Point Cloud Geometry Compression: A Lossy Perspective
🎯 Vector Quantization · arxiv.org · 2d
Voxtral TTS
✨ Gemini · arxiv.org · 6d
RSR-core: A High-Performance Engine for Low-Bit Matrix-Vector Multiplication
⚡ Vectorized Execution · arxiv.org · 2d
ITQ3_S: High-Fidelity 3-bit LLM Inference via Interleaved Ternary Quantization with Rotation-Domain Smoothing
🧠 LLM Inference · arxiv.org · 2d